49 research outputs found

    Benchmarking the Utility of w-Event Differential Privacy Mechanisms: When Baselines Become Mighty Competitors

    Get PDF
    The w-event framework is the current standard for ensuring differential privacy on continuously monitored data streams. Following the proposition of w-event differential privacy, various mechanisms to implement the framework are proposed. Their comparability in empirical studies is vital for both practitioners to choose a suitable mechanism, and researchers to identify current limitations and propose novel mechanisms. By conducting a literature survey, we observe that the results of existing studies are hardly comparable and partially intrinsically inconsistent. To this end, we formalize an empirical study of w-event mechanisms by re-occurring elements found in our survey. We introduce requirements on these elements that ensure the comparability of experimental results. Moreover, we propose a benchmark that meets all requirements and establishes a new way to evaluate existing and newly proposed mechanisms. Conducting a large-scale empirical study, we gain valuable new insights into the strengths and weaknesses of existing mechanisms. An unexpected - yet explainable - result is a baseline supremacy, i.e., using one of the two baseline mechanisms is expected to deliver good or even the best utility. Finally, we provide guidelines for practitioners to select suitable mechanisms and improvement options for researchers

    On the Complexity of Query Result Diversification

    Get PDF
    Query result diversification is a bi-criteria optimization problem for ranking query results. Given a database D, a query Q and a positive integer k, it is to find a set of k tuples from Q(D) such that the tuples are as relevant as possible to the query, and at the same time, as diverse as possible to each other. Subsets of Q(D) are ranked by an objective function defined in terms of relevance and diversity. Query result diversification has found a variety of applications in databases, information retrieval and operations research. This paper studies the complexity of result diversification for relational queries. We identify three problems in connection with query result diversification, to determine whether there exists a set of k tuples that is ranked above a bound with respect to relevance and diversity, to assess the rank of a given k-element set, and to count how many k-element sets are ranked above a given bound. We study these problems for a variety of query languages and for three objective functions. We establish the upper and lower bounds of these problems, all matching, for both combined complexity and data complexity. We also investigate several special settings of these problems, identifying tractable cases. 1

    Tag-Aware Recommender Systems: A State-of-the-art Survey

    Get PDF
    In the past decade, Social Tagging Systems have attracted increasing attention from both physical and computer science communities. Besides the underlying structure and dynamics of tagging systems, many efforts have been addressed to unify tagging information to reveal user behaviors and preferences, extract the latent semantic relations among items, make recommendations, and so on. Specifically, this article summarizes recent progress about tag-aware recommender systems, emphasizing on the contributions from three mainstream perspectives and approaches: network-based methods, tensor-based methods, and the topic-based methods. Finally, we outline some other tag-related works and future challenges of tag-aware recommendation algorithms.Comment: 19 pages, 3 figure

    Geotag Propagation with User Trust Modeling

    Get PDF
    The amount of information that people share on social networks is constantly increasing. People also comment, annotate, and tag their own content (videos, photos, notes, etc.), as well as the content of others. In many cases, the content is tagged manually. One way to make this time-consuming manual tagging process more efficient is to propagate tags from a small set of tagged images to the larger set of untagged images automatically. In such a scenario, however, a wrong or a spam tag can damage the integrity and reliability of the automated propagation system. Users may make mistakes in tagging, or irrelevant tags and content may be added maliciously for advertisement or self-promotion. Therefore, a certain mechanism insuring the trustworthiness of users or published content is needed. In this chapter, we discuss several image retrieval methods based on tags, various approaches to trust modeling and spam protection in social networks, and trust modeling in geotagging systems. We then consider a specific example of automated geotag propagation system that adopts a user trust model. The tag propagation in images relies on the similarity between image content (famous landmarks) and its context (associated geotags). For each tagged image, similar untagged images are found by the robust graph-based object duplicate detection and the known tags are propagated accordingly. The user trust value is estimated based on a social feedback from the users of the photo-sharing system and only tags from trusted users are propagated. This approach demonstrates that a practical tagging system significantly benefits from the intelligent combination of efficient propagation algorithm and a user-centered trust model
    corecore